4 research outputs found

    A Large-Vocabulary Bilingual Speech Recognition System for Chinese and Japanese Language

    A Mandarin Voice Organizer Based on a Template-Matching Speech Recognizer

    A Large-Vocabulary Bilingual Speech Recognition System for Chinese and Japanese Language

    Bilingual and multilingual speech recognition is gradually becoming an attractive research topic because bilingual writing now appears almost everywhere. In this paper, we propose a continuous word-based speech recognition system that dictates Mandarin and Japanese speech simultaneously. We find that there are about 62 basic phoneme-like units (PLUs) among the mixed Mandarin and Japanese syllables. The 62 HMMs are used to decode the input speech into word hypotheses based on a fast tree-beam search algorithm. In the language model, bigram and trigram models are used to select the most likely word from the word candidates. We also use a bilingual dictionary to handle cross-language information. The proposed system architecture can not only dictate Mandarin and Japanese speech simultaneously but also offers a possible solution for recognizing any other bilingual speech.
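
As a rough illustration of how a bigram model can select the most likely word from a set of word candidates, the following minimal sketch runs a Viterbi-style pass over per-position candidate lists. It is not the authors' implementation: the function name `best_word_sequence`, the log-score inputs, the `<s>` start symbol, and the `floor` penalty for unseen bigrams are all assumptions made for this example.

```python
def best_word_sequence(candidates, bigram_logprob, floor=-10.0):
    """Pick the highest-scoring word sequence.

    candidates: list of positions, each a list of (word, acoustic_log_score).
    bigram_logprob: dict mapping (previous_word, word) -> log probability.
    floor: assumed log probability for bigrams missing from the table.
    """
    # best maps the last word of a partial path to (total score, path).
    best = {"<s>": (0.0, [])}
    for slot in candidates:
        new_best = {}
        for word, acoustic in slot:
            options = []
            for prev, (score, path) in best.items():
                lm = bigram_logprob.get((prev, word), floor)
                options.append((score + lm + acoustic, path + [word]))
            # Keep only the best path that ends in this word.
            new_best[word] = max(options, key=lambda o: o[0])
        best = new_best
    return max(best.values(), key=lambda o: o[0])[1]


# Hypothetical candidates and bigram scores, for illustration only.
candidates = [
    [("ni", -1.0), ("li", -0.9)],
    [("hao", -1.0), ("gao", -1.0)],
]
bigrams = {
    ("<s>", "ni"): -0.5,
    ("ni", "hao"): -0.2,
    ("<s>", "li"): -2.0,
    ("li", "gao"): -3.0,
}
result = best_word_sequence(candidates, bigrams)
```

Here the language model overrides the slightly better acoustic score of "li", because the bigram path through "ni" and "hao" is far more probable. A trigram model would extend the same idea by keying the table on the two previous words.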

    A Mandarin Voice Organizer Based on a Template-Matching Speech Recognizer

    Currently available voice organizers all accept only voice commands or word-based commands. Operating an organizer with natural spoken language remains a difficult problem. In this paper, we propose a template-based speech recognizer that accepts near-natural (constrained) spoken language. Because the template-based recognizer is a domain-dependent speech recognition system, representing and matching sentence templates are its main tasks. We use finite-state networks (FSNs) to represent the sentence templates and propose a vowel-based, syllable-scoring method to match the correct template. By replacing the template set, this method can easily be applied to other domains. In addition, two main functions, voice recording and voice message query, are implemented on our organizer using a fast CELP encoder/decoder to compress and decompress the voice data in real time. Experimental results show that the 31 collected sentence templates can greatly improve the voice interface between the user and the voice organizer.
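
To make the idea of representing sentence templates as finite-state networks concrete, here is a minimal sketch under stated assumptions. Each template is a deterministic network stored as a transition table, and an utterance matches a template when its tokens drive the network from the start state to a final state. The two templates, the English tokens, and the names `accepts` and `match_template` are hypothetical. The sketch uses exact token matching and does not reproduce the vowel-based, syllable-scoring method described in the abstract.

```python
def accepts(transitions, start, finals, tokens):
    """Return True if the token sequence is accepted by the network.

    transitions: dict mapping (state, token) -> next state.
    """
    state = start
    for token in tokens:
        nxt = transitions.get((state, token))
        if nxt is None:
            return False  # no arc for this token from the current state
        state = nxt
    return state in finals


# Hypothetical sentence templates: name -> (transitions, start, final states).
TEMPLATES = {
    "record": (
        {(0, "record"): 1, (1, "a"): 2, (1, "message"): 3, (2, "message"): 3},
        0,
        {3},
    ),
    "query": (
        {
            (0, "play"): 1,
            (1, "messages"): 2,
            (2, "from"): 3,
            (3, "today"): 4,
            (3, "yesterday"): 4,
        },
        0,
        {4},
    ),
}


def match_template(tokens, templates=TEMPLATES):
    """Return the name of the first template that accepts the tokens."""
    for name, (transitions, start, finals) in templates.items():
        if accepts(transitions, start, finals, tokens):
            return name
    return None
```

Swapping in a different `TEMPLATES` table is all that is needed to target another domain, which mirrors the abstract's point about replacing the template set.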